GermanPolarityClues: A Lexical Resource for German Sentiment Analysis

نویسنده

  • Ulli Waltinger
چکیده

In this paper, we propose GermanPolarityClues, a new publicly available lexical resource for sentiment analysis for the German language. While sentiment analysis and polarity classification has been extensively studied at different document levels (e.g. sentences and phrases), only a few approaches explored the effect of a polarity-based feature selection and subjectivity resources for the German language. This paper evaluates four different English and three different German sentiment resources in a comparative manner by combining a polarity-based feature selection with SVM-based machine learning classifier. Using a semi-automatic translation approach, we were able to construct three different resources for a German sentiment analysis. The manually finalized GermanPolarityClues dictionary offers thereby a number of 10, 141 polarity features, associated to three numerical polarity scores, determining the positive, negative and neutral direction of specific term features. While the results show that the size of dictionaries clearly correlate to polarity-based feature coverage, this property does not correlate to classification accuracy. Using a polarity-based feature selection, considering a minimum amount of prior polarity features, in combination with SVM-based machine learning methods exhibits for both languages the best performance (F1: 0.83-0.88).

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

SentiWS - A Publicly Available German-language Resource for Sentiment Analysis

University of Leipzig, Natural Language Processing Department, Johannisgasse 26, 04081 Leipzig, Germany [email protected], {quasthoff, heyer}@informatik.uni-leipzig.de Abstract SentimentWortschatz, or SentiWS for short, is a publicly available German-language resource for sentiment analysis, opinion mining etc. It lists positive and negative sentiment bearing words weighted within the...

متن کامل

Prior Polarity Lexical Resources for the Italian Language

In this paper we present SABRINA (Sentiment Analysis: a Broad Resource for Italian Natural language Applications) a manually annotated prior polarity lexical resource for Italian natural language applications in the field of opinion mining and sentiment induction. The resource consists in two different sets, an Italian dictionary of more than 277.000 words tagged with their prior polarity value...

متن کامل

Harmonization of German Lexical Resources for Opinion Mining

We present on-going work on the harmonization of existing German lexical resources in the field of opinion and sentiment mining. The input of our harmonization effort consisted in four distinct lexicons of German word forms, encoded either as lemmas or as full forms, marked up with polarity features, at distinct granularity levels. We describe how the lexical resources have been mapped onto eac...

متن کامل

Towards Building a SentiWordNet for Tamil

Sentiment analysis is a discipline of Natural Language Processing which deals with analysing the subjectivity of the data. It is an important task with both commercial and academic functionality. Languages like English have several resources which assist in the task of sentiment analysis. SentiWordNet for English is one such important lexical resource that contains subjective polarity for each ...

متن کامل

A Supervised Method for Constructing Sentiment Lexicon in Persian Language

Due to the increasing growth of digital content on the internet and social media, sentiment analysis problem is one of the emerging fields. This problem deals with information extraction and knowledge discovery from textual data using natural language processing has attracted the attention of many researchers. Construction of sentiment lexicon as a valuable language resource is a one of the imp...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010